15 research outputs found

    Learning to Search in Reinforcement Learning

    In this thesis, we investigate the use of search-based algorithms with deep neural networks to tackle a wide range of problems, from board games to video games and beyond. Drawing inspiration from AlphaGo, the first computer program to achieve superhuman performance in the game of Go, we developed a new algorithm, AlphaZero. AlphaZero is a general reinforcement learning algorithm that combines deep neural networks with Monte Carlo tree search for planning and learning. Starting completely from scratch, without any prior human knowledge beyond the basic rules of the game, AlphaZero achieved superhuman performance in Go, chess, and shogi. Subsequently, building upon the success of AlphaZero, we investigated ways to extend our methods to problems in which the rules are not known or cannot be hand-coded. This line of work led to the development of MuZero, a model-based reinforcement learning agent that builds a deterministic internal model of the world and uses it to construct plans in its imagination. We applied our method to Go, chess, shogi, and the classic Atari suite of video games, achieving superhuman performance. MuZero is the first RL algorithm to master both canonical challenges for high-performance planning and visually complex problems using the same principles. Finally, we describe Stochastic MuZero, a general agent that extends the applicability of MuZero to highly stochastic environments. We show that our method achieves superhuman performance in stochastic domains such as backgammon and the classic game of 2048, while matching the performance of MuZero in deterministic ones like Go.
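    The central mechanism this line of work builds on, Monte Carlo tree search guided by a network's policy priors and value estimates, can be illustrated by the PUCT selection rule used to choose which move to explore next. The following is a minimal sketch of that one step only; the statistics and the `c_puct` constant are illustrative assumptions, not values from the thesis:

```python
import math

def puct_score(q, p, n_parent, n_child, c_puct=1.5):
    """AlphaZero-style PUCT rule: exploit the mean backed-up value q,
    explore in proportion to the network prior p, discounted by visits."""
    return q + c_puct * p * math.sqrt(n_parent) / (1 + n_child)

# Hypothetical statistics for three candidate moves at one tree node.
priors = [0.6, 0.3, 0.1]   # policy-network output (illustrative numbers)
values = [0.1, 0.4, 0.0]   # mean value of simulations through each move
visits = [10, 3, 1]        # visit counts N(s, a)
n_parent = sum(visits)

scores = [puct_score(q, p, n_parent, n)
          for q, p, n in zip(values, priors, visits)]
best = max(range(len(scores)), key=scores.__getitem__)  # move explored this simulation
```

    Note how the second move wins here despite a lower prior: its higher mean value outweighs the exploration bonus of the heavily favored first move, which has already been visited many times.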

    Playing Atari with Deep Reinforcement Learning

    We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning. The model is a convolutional neural network, trained with a variant of Q-learning, whose input is raw pixels and whose output is a value function estimating future rewards. We apply our method to seven Atari 2600 games from the Arcade Learning Environment, with no adjustment of the architecture or learning algorithm. We find that it outperforms all previous approaches on six of the games and surpasses a human expert on three of them. Comment: NIPS Deep Learning Workshop 2013
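    The paper's network replaces a lookup table with a convolutional network over pixels, but the underlying update is ordinary Q-learning. A minimal tabular sketch on a hypothetical five-state chain environment (environment, hyperparameters, and episode counts are all illustrative, not from the paper):

```python
import random

random.seed(0)

# Toy chain MDP: states 0..4, actions 0 (left) / 1 (right);
# reaching state 4 yields reward 1 and ends the episode.
N_STATES, GOAL = 5, 4
alpha, gamma, eps = 0.5, 0.9, 0.3
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def step(s, a):
    s2 = max(0, min(GOAL, s + (1 if a == 1 else -1)))
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def greedy(s):
    if Q[s][0] == Q[s][1]:          # break ties randomly
        return random.randrange(2)
    return 0 if Q[s][0] > Q[s][1] else 1

for _ in range(500):                # episodes
    s = 0
    for _ in range(100):            # step cap per episode
        a = random.randrange(2) if random.random() < eps else greedy(s)
        s2, r, done = step(s, a)
        target = r if done else r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])   # temporal-difference update
        if done:
            break
        s = s2

policy = [greedy(s) for s in range(GOAL)]   # learned greedy action per state
```

    The paper's variant applies the same temporal-difference target, but computes `Q` with a convolutional network and stabilizes training by sampling transitions from an experience replay buffer.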

    Leaf age-dependent effects of foliar-sprayed CuZn nanoparticles on photosynthetic efficiency and ROS generation in <i>Arabidopsis thaliana</i>

    Young and mature leaves of <i>Arabidopsis thaliana</i> were exposed by foliar spray to 30 mg L−1 of CuZn nanoparticles (NPs). The NPs were synthesized by a microwave-assisted polyol process and characterized by dynamic light scattering (DLS), X-ray diffraction (XRD), and transmission electron microscopy (TEM). The effects of CuZn NPs in Arabidopsis leaves were evaluated by chlorophyll fluorescence imaging analysis, which revealed spatiotemporal heterogeneity in the quantum efficiency of PSII photochemistry (ΦPSII) and the redox state of the plastoquinone (PQ) pool (qp), measured 30 min, 90 min, 180 min, and 240 min after spraying. Photosystem II (PSII) function in young leaves was negatively affected, especially 30 min after spraying, when increased H2O2 generation correlated with a more reduced PQ pool. The photosynthetic efficiency of young leaves recovered only 240 min after the NP spray, by which time ROS accumulation had also returned to the level of control leaves. In contrast, a beneficial effect on PSII function was observed in mature leaves 30 min after the CuZn NP spray, with increased ΦPSII, an increased electron transport rate (ETR), decreased singlet oxygen (1O2) formation, and H2O2 production at the same level as control leaves. An explanation for this differential response is suggested.